DeCour: a corpus of DEceptive statements in Italian COURts
Abstract
In criminal proceedings, it is sometimes difficult to evaluate the sincerity of oral testimony. The DECOUR (DEception in COURt) corpus has been built with the aim of training models that can discriminate, from a stylometric point of view, between sincere and deceptive statements. DECOUR is a collection of hearings held in four Italian courts in which the speakers lie in front of the judge. These hearings then became the object of specific criminal proceedings for calumny or false testimony, in which the deceptiveness of the defendant's statements was ascertained. Thanks to the final court judgment, which points out which lies were told, each utterance in the corpus has been annotated as true, uncertain or false, according to its degree of truthfulness. Because the judgment of deceptiveness follows a judicial inquiry, the annotation has been carried out with a greater degree of confidence than ever before. Moreover, this is the first corpus of deceptive texts in Italy that does not rely on ‘mock’ lies created in laboratory conditions but was collected in a natural environment.
Similar resources
Deception Detection in Italian Court testimonies
Effective methods for evaluating the reliability of statements issued by witnesses and defendants in hearings would be extremely valuable to decision-making in Court and other legal settings. In recent years, methods relying on stylometric techniques have proven most successful for this task; but few such methods have been tested with language collected in real-life situations of high-stakes de...
On the Use of Homogenous Sets of Subjects in Deceptive Language Analysis
Recent studies on deceptive language suggest that machine learning algorithms can be employed with good results for classification of texts as truthful or untruthful. However, the models presented so far do not attempt to take advantage of the differences between subjects. In this paper, models have been trained in order to classify statements issued in Court as false or not-false, not only tak...
Verification and Implementation of Language-Based Deception Indicators in Civil and Criminal Narratives
Our goal is to use natural language processing to identify deceptive and nondeceptive passages in transcribed narratives. We begin by motivating an analysis of language-based deception that relies on specific linguistic indicators to discover deceptive statements. The indicator tags are assigned to a document using a mix of automated and manual methods. Once the tags are assigned, an interprete...
Intonational features for identifying regional accents of Italian
Aim of this paper is providing a preliminary account of some intonational features useful for identifying a large number of Italian accents, estimated as representative of Italian regional variation, by analysing a corpus of comparable speech materials consisting of Map Task dialogues. Analysis concentrates on the intonational characteristics of yes-no questions, which can be realised very diff...
Distinguishing deceptive from non-deceptive speech
To date, studies of deceptive speech have largely been confined to descriptive studies and observations from subjects, researchers, or practitioners, with few empirical studies of the specific lexical or acoustic/prosodic features which may characterize deceptive speech. We present results from a study seeking to distinguish deceptive from non-deceptive speech using machine learning techniques ...
Publication date: 2012